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Abstract 



g_) ' We implement the Wang-Landau algorithm in the context of SU{N) lattice gauge theo- 

^ : 

ries. We study the quenched, reduced version of the lattice theory and calculate its density 
^ I of states for A^ = 20,30,40,50. We introduce a variant of the original algorithm in which 

the weight function used in the update does not asymptote to a fixed function, but rather 
continues to have small fluctuations which enhance tunneling. We formulate a method to 



C^^ I evaluate the errors in the density of states, and use the result to calculate the dependence 

0|0 . of the average action density and the specific heat on the 't Hooft coupling A. This allows 

us to locate the coupling At at which a strongly first order transition occurs in the system. 
/\ • For A^ = 20 and 30 we compare our results to those obtained using Ferrenberg-Swendsen 

multi-histogram reweighting and find agreement with errors of 0.2% or less. Extrapolating 
our results to A^ = cxd we find (A^)" = 0.3148(2). We remark on the significance of this 
result for the validity of quenched large- A^ reduction of SU{N) lattice gauge theories. 



I. INTRODUCTION 

Strongly first order phase transitions provide a difficult challenge for numerical simula- 
tions. Consider, for example, a physical system whose interactions are characterized by a 
single coupling g. A naive approach to estimate the transition coupling, gt, is to perform 
Monte-Carlo (MC) simulations at couplings that are close to gt and locate the point at 
which different observables are discontinuous. These measurements, however, are affected 
by a strong hysteresis whose width grows with the number of degrees of freedom A'^^of . For 
large enough A'dof, this width dominates the error in the transition coupling, which can 
result in large uncertainties (10% — 20% in the example we consider here). 

To obtain improved precision the way forward is undoubtedly to use reweighting al- 
gorithms. For example, Ferrenberg-Swendsen reweighting (FSR) uses MC simulations to 
measure the normalized histogram of the action A for a coupling g = go which is close to 
dt [ll- By construction, this is given by 

ho{A) ~ p{A) X PBoltzmann(^0; ^) , (1-1) 

where p{A) is the density of states and -pBoitzmannls'o; ^) is the Boltzmann weight. In the 
SU{N) lattice gauge theories we consider here, g is typically identified with the bare lattice 't 
Hooft coupling A, and -PBoitzmann(fi'o; A) ~ exp(A/A). A measurement of ho{A) thus provides 
an estimate for p{A). Using this, one can estimate the histogram h\{A) at any other coupling 
by reweighting: 

hx{A) ~ hoiA) X exp (A/X - A/Aq) . (1.2) 

One then determines the coupling A^ at which the corresponding histogram hx^{A) takes a 
double-peak form, as expected for a ffist-order phase transition. In practice this amounts 
to calculating the average action A and its associated specific heat C as a function of A 



^(A) = dA hx{A) A, (1.3) 

C(A) = J dA hx{A) {A-A{X)f, (1.4) 

and finding the coupling A^ at which C(A) peaks. 

The FSR method has an obvious shortcoming. When reweighting from Aq to A one 
is "amplifying" the contribution to p{A) from field configurations that are important at 
A while suppressing those relevant at Aq. If, however, the field configurations probed at 



Ao are substantially different from those important at A, then this amplification can be 
dominated by statistical noise. This "overlap" problem can cause a large systematic error 
which may be hard to evaluate. To avoid it one needs to ensure that the field configurations 
that are important at A are reasonably sampled when performing measurements at Aq. In 
ordinary situations this means that the couplings Aq and A need to be sufficiently close. 
When A ~ Aj, however, there are field configurations which are very hard to probe. These 
are the tunneling configurations between the two phases. Thus for reweighting to work in 
the context of locating a strongly first order phase transition, one requires that a sufficient 
number of tunneling events are observed whilst measuring the histograms. This requirement 
can be very restrictive when A^dof is large because the tunneling probability typically falls 
exponentially as A'^dof increases. Consequently, when performing reweighting, it is crucial to 
use an algorithm that encourages tunneling events. 

In this paper we do not discuss all the different alternatives to FSR (which are discussed, 
for example, in Ref. 2f|).^ Instead we choose to study (a variant of) the Wang-Landau^ (WL) 
reweighting algorithm, which was introduced in the field of statistical mechanics \Q\, and 
is particularly well suited for promoting tunneling. In the context of gauge theories, this 



algorithm can be considered to be a modern incarnation of the early attempts, such as the 
ones in Ref. [7|, to calculate p{A) of lattice gauge theories (see also Refs. |8|, |9|]). 

A sketch of the WL algorithm is as follows (a more precise definition will be given in 
Sec. mil) . From here on we use the action density E ~ A/N^of as our prime observable, and 
so denote the density of states by p{E). We denote the Monte-Carlo time by t and the WL 
estimate of the density of states at time t by pt{E). 

1. Begin at MC-time t = with an initial estimate for the density of states, Po{E). 

2. Use l/pt{E) as a Boltzmann weight to create a series of field configurations. 

3. Update PtiE) -^ pt+i(E) = pt(E) + Sp(E). The update function 6p(E) depends on 
the MC history between times t and t + 1 in a way that biases against small or null 



^ One attractive option is the multi-canonical algorithm of Ref. g . We did not use this approach because it 
had been found in Ref. [J] , which studied a model similar to ours, that a very delicate tuning of parameters 
was needed for large enough systems. One advantage of the WL algorithm is that it is self-tuning. An 
alternative, applied successfully in Ref. [5|, is to use the WL algorithm to provide an estimate of the 
weight function of the multi-canonical algorithm. 



changes of E at time t + 1, and so encourages tunnelings. This is an essential point in 
Wang-Landau reweighting (WLR) and we discuss it in greater detail in Section lllli 

4. GoTo step (2). 

One can show, with some assumptions, that for large enough t, pt{E) converges to the 
vicinity of p{E), and subsequently fluctuates around it. We provide this demonstration in 
Sec. IIIIBl generalizing the discussion in Ref. [lOJ]. The fluctuations are an intrinsic part of 
the WL algorithm, and are the consequence of the ergodicity enforced by the "biasing" in 
step (3) above. Once converged, the algorithm generates a chain of field configurations that 
are weighted by an approximately flat probability function 

P{E) ~ p{E) X l/pt{E) ^ E- independent . (1.5) 

Consequently, all values of E will be accessed with approximately equal probability, including 
those corresponding to tunneling events. Using the estimate of p{E) one can calculate the 
specific heat C(A) and locate its peak. 

In this paper we adapt the WL algorithm to SU{N) gauge theories and in particular 
formulate a systematic way to evaluate errors in derived quantities such as C(A). The 
model we choose to study is obtained from four-dimensional SU{N) lattice gauge theories 
by "quenched reduction" to a single lattice site (see, for example, Refs. [Uj and the recent 
review in Ref. [12] )• It is a matrix model of four SU{N) matrices. The interactions between 
these matrices are governed by the 't Hooft coupling A, and lead to a nontrivial change 
in various expectation values as one moves from strong to weak couplings. This behavior 
becomes a strongly first order transition when N -^ oo and it is this transition we wish to 
analyze using the WL algorithm. 

As noted above, we use a variant of the WL algorithm. The key difference between 
our variant and the original WL algorithm ("WL''"), is that the latter includes an iterative 
procedure which we do not use.^ Namely, in WL", the steps (1-4) above are first applied with 
a given update function, 6pi{E), for some Monte-Carlo time Ti. The time Ti is determined 
"on the fly" by requiring that the values of E that are visited are sufficiently uniform. Once 
the chosen criterion is fulfilled, the function 6pi{E) is replaced by 6p2{E) which is smaller. 



^ A less important difference is that one must adapt the original algorithm from systems with discrete 
variables to those with continuous degrees of freedom. We describe how this has been done below. 



i.e. obeys \Sp2{E)\ < \6p2{E)\. The procedure is then iterated until the magnitude of Sp{E) 
drop below the machine accuracy. As shown in 13j, and discussed below (for example see 
Section HVl) . the tunneling rate, which is what one wishes to increase in the WL algorithm, 
decreases as the size of Sp{E) is decreased, making the WL° less and less efficient as it 
is iterated. For this reason we keep Sp{E) finite, and thus always have a Boltzman weight 
which varies (albeit by a small amount) so as to maintain the tunneling rate. This also avoids 
the need to tune extra parameters, such as the choice of the flatness criterion. Despite the 
lack of a fixed weight, we can measure expectation values since pt{E) fiuctuates around the 
correct value. 

A different solution to the tunneling problem, involving ultimately fixed weights, is pre- 
sented in Ref. [jj]. 

The outline of the paper is as follows. We first introduce the matrix model that we study 
in Sec. HIl We describe the Wang-Landau algorithm and its properties in Sec. Illlt and in 
Sec. IIVI we describe our implementation and in particular the tuning of parameters and the 
calculation of errors. In Sec.|V]we report our results and compare them to corresponding data 
obtained using Ferrenberg-Swendsen reweighting and standard Monte-Carlo simulations. We 
summarize in Sec. IVIt and remark on the implication of our results to the validity of large- iV 
quenched reduction of SU{N) lattice gauge theories. Appendix [X] includes a description of 
the different update algorithms which we use, and Appendix [B] discusses additional technical 
issues related to the implementation of the Wang-Landau algorithm. 

Our results for the transition coupling A( were already quoted in Ref. [12]. 

II. QUENCHED-REDUCED SU{N) LATTICE GAUGE THEORIES 

In this section we briefly describe the matrix model which we study. For a discussion of 



its relevance to SU{N) gauge theories we refer to Ref. 121] and references therein. 



A. Definition of the matrix model 



The model consists of four SU{N) matrices {V^ ; p = 1,2,3,4}. Observables are built 
from the SU{N) 'link matrices' f/^ deflned by 

U, ^ V,A,Vi, (2.1) 



where A^ are the fixed, diagonal SU{N) matrices 

j^ab ^ gab ^ipl . ^a ^ Jq^ ^Tl] , a, 6 G [l, A^] . (2.2) 

The quenched momenta p^ are drawn from some distribution — various possibihties are dis- 
cussed in Ref . 12| . Since our focus here is on the algorithm, we pick one choice of momenta 



(the "clock" momenta). 



Pl='%{^-^) ' «e[l>^]' (2-3) 



and use it throughout. Expectation values of an observable 0(LI) are calculated via 

{0{U))=Z{h)-^ fUDV, expibA) 0{U) , (2.4) 

where here the action A is 

A = AT ^ 2Re Tr (u^UMlUl) , (2.5) 

and b is the inverse of the 't Hooft coupling, b = 1/A. The partition function Z{b) is 

Z{b)= fUDV^expibA), (2.6) 

and DV^ is the Haar measure on SU{N). The integral over V^ includes matrices that realize 
permutations in the indices a of p" and so the construction above is invariant under such 
permutations. Thus one can equally define the model with any set of p" obtained from 
Eq. (12.31) by permuting the a indices, independently in each direction. 
We take the action density to be 

so that it is the average, normalized plaquette. We consider, for simplicity, only even values 
of A^, for which E lies in the range [—1, 1]. 

B. A sketch of the phase diagram 



In Ref. 12j we mapped the phase diagram of the model in b, and saw strong evidence 



that there exists a first order phase transition at 6 = 6t ~ 0.3. This was also seen in earlier 
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FIG. 1: Hysteresis plots of the average action density E versus b = 1/A for SU{50) ([blue] crosses 



SU{80) ([magenta] open squares), and SU{100) ([light blue] filled squares). For details see Ref. J12l |. 



studies of the model (for example in Ref. 15|). To demonstrate this we present in Fig. [T] our 
results for {E){b), obtained using conventional MC simulations (using algorithms described 
in Appendix. [B]). A clear hysteresis is seen, with width increasing with N, as expected since 
A^dof oc A^^. Our aim in this paper is to develop a method that can accurately locate the 
coupling bt at which this transition occurs. 



III. WANG-LANDAU REWEIGHTING 



Reweighting methods start by integrating out all but a few variables (usually one or two) 
from the partition function. We use a single remaining variable, the action density E: 



Z{h) = fDVexp{bA)= f dE p{E) exp (l2N^ b E 
= fdEexp{uj{E) + 12NHEy 



(3.1) 
(3.2) 



Here p{E) is the number density of field configurations with action density in the range 
[E, E + dE] and uj{E) is the associated "entropy": 

coiE) = log (piE)) . (3.3) 

The expectation value of an observable, 0{E), that depends solely on E can be written as 

{0) = Z-\b) fdEexp{uj{E) + 12 NHe) 0{E). (3.4) 

Thus, calculating the function uj{E) can be considered as solving the theory in the sector 
that couples to operators of the form 0(E). 

n 

The original WL algorithm was introduced in Ref. p] to study statistical systems with 
discrete degrees of freedom. E then takes discrete values, and this is reflected in the for- 
mulation of the original algorithm. In our case, however, E is continuous, and we need to 
adapt WLR accordingly. Two alternative approaches have been considered in the statistical- 
mechanics and molecular-dynamics literature: (1) Discretize E into bins and then apply the 
discrete WL algorithm. This approach has been used, for example, to study the classical 
Heisenberg model [l6|; (2) Geiieralize the WL algorithm so that it updates uj{E) treating 



E as a continuous variable 



17 



18|.^ Based on preliminary studies, we chose to pursue only 
option (2) in detail. We follow and extend the approach suggested in Ref. 18J], which we 
next describe and analyze in some detail. 

A. The algorithm 

The algorithm proceeds by updating an estimate of the entropy, uJt{E), where t is the 
Monte-Carlo (MC) time. It also updates the histogram of the action, ht{E), which is an aux- 



iliary quantity used to estimate convergence. The steps of the algorithm are as follows jl8| : 



1. Make an initial guess for the entropy function at time t = 0, ujq{E), using any available 
prior knowledge, such as the results from a related system (e.g. a smaller value of A^ in 
our study). Set the histogram to zero: ht=o{E) = 0. Pick any starting configuration. 



^ Binning is still required to store functions like uj{E) in memory, but does not play an essential role in the 
algorithm. See Appendix [Bj for further discussion. 



2. Propose a new field configuration in an unbiased way, and accept it with probability: 

Prob(E -^ E') = min [exp {uJt{E) - cotiE')), 1 ] , (3.5) 

where E and E' are respectively the action densities of the original and proposed 
configurations. 

3. Repeat step (2) A^hit times for equilibration. 

4. Let Et be the final value of the action density after step (3). Update the entropy as 
follows: 

uJt{E) ^ uJt+i{E) = uJt{E) + ^Fs{E, Et) , (3.6) 

where 7 > and Fs is a fixed, positive function, which smears the update over a range 
of action density of width ~ 6 centered on Et, and should be invariant under Et ^^ E. 
Possible choices for Fs are discussed in Ref. |l8| — we use a simple Gaussian form 

FsiE,Et) = e-^^-^^^'/'\ (3.7) 

5. Update the histogram: 

ht{E) ^ ht+,{E) = ht{E) + 6{E - Et). (3.8) 

6. GoTo step (2). 

It is important to understand the meaning of the crucial step (4) : if the simulation has 
spent some time in the vicinity of a particular value of E, then step (4) will increase Ut in 
this region, and the update probability (13. 5p will favor motion to other regions of E — this 
is how the WL algorithm encourages tunneling. 

As we show in the following subsection, the WL algorithm converges in the sense that, for 
large enough t, uJt{E) — uj{E) fluctuates around an ii^-independent constant. This constant 
drops out when one uses Eq. (13.41) to calculate averages of physical observables, and so it is 
in this sense that 

lim UtiE) = uj{E) (3.9) 

t — >oo 

in the WL algorithm. 

In Section HVt we suggest practical ways to determine how large t needs to be, how to 
evaluate the errors in the estimate for uj{E), and how to choose appropriate ranges for the 

8 



parameters 6 and 7. In our implementation of the algorithm we restrict E to lie in the 
interval -Emin ^ E < -Emaxj which is a subset of the full range of values E can take. Apart 
from the need to begin with a configuration having E inside this range, the only change to 
the algorithm involve certain boundary effects that we discuss in Appendix [Bl 

B. Theoretical analysis of Wang-Landau re^veighting 

A theoretical analysis of the original, 'discrete' version, of the WL algorithm, including 
some of its systematic errors, was given in Ref . [10|] . In this section we extend that analysis 
to the continuous WL algorithm just described. 

Consider the probability distribution of the action density E at MC-time t after step (3) 
of the algorithm has been completed. Assuming that A^hit is large enough, this is 

Pt{E) = ^exp{uj{E)-uJt{E)), (3.10) 

where the normalization factor Zt ensures that J dE Pt{E) = 1 (where here and in the fol- 
lowing the E integral implicitly runs from -Emin to -Emax)- This is the probability distribution 
from which Et is drawn. After updating uJt{E) according to eq. (13.61) . the function pt{E) 
changes as follows 

Pt{E) -^ pt+iiE) = —— exp [uj{E) - utiE) - ^FsiE, Et)] , (3.11) 

and a simple manipulation gives 

Pt+i{E) exp[—fFsiE,Et)] 



Pt{E) (exp[-7F5(E,E,)]); 
Here the average (, )t is with respect to the probabihty distribution at time t. 



(3.12) 



{f{E)),^J dEptiE)f{E). (3.13) 

In order to understand the convergence properties of the algorithm, we need a measure of 
the closeness of the estimate uJt{E) to the true uj{E). When uJt{E) = uj{E), the probability 
distribution (13.101) is flat, i.e. Pt{E) = Pflat(-E') = l/^E. Thus one possible measure of 
convergence is 



r d±J , PtiE) r dE ^ ^ ^ , ,, , , 



dE 



This is adapted from the similar discrete quantity used in Ref . [10| • It is straightforward to 
see that fit < 0,'' with the upper bound saturated only when pt = pflat- 

We thus consider the change, A/it = /^t+i — /^t, between two adjacent time steps: 

A/xt = J{dE/AE) \og[pt+i{E)/pt{E)] (3.15) 

= J{dE/AE) {—fFs{E,Et) -\og{exp[—fFs{E,Et)])t} . (3.16) 

Our choice of Fs, eq. (13. 7p . satisfies J dE Fs{E,Et) = d^/ii for all -Emin ^ Et < -Emax (This 
is true even taking into account any boundary effects - see Appendix [B]) . Thus 

A/it = —fSV^/AE - log (exp [--fFsiE, Et)])t • (3.17) 

Since 7 > 0, the logarithm is always negative and A/it is bounded from below 

A/it > -7(5v^/AE . (3.18) 

Had this lower bound had been zero, then a monotonic convergence of /it —> as t ^ cxo 
would have been possible. A negative lower bound, however, suggests a more complicated 
behavior involving fluctuations. In the rest of this section we describe the way these fluctu- 
ations emerge and quantify how they effect fit, uJt{E) and Pt{E). 

1. Afit o-s a function oft and its ensemble average 

We begin by illustrating the possible values that A/it can take. First consider the 5 — > 
limit, in which, assuming also that 7 <^ 1, one flnds^ 

A/It = ^5^/^{pt{Et) - pfiat) + 0(7') . (3.19) 

Thus ii pt{Et) is above (below) the flat distribution value 1/AE, then A/it is positive (neg- 
ative). For large t, as we will see below, the generic size of \pt — PflatI is ~ ^/^y, so that A/Xt 
then scales as 7^'^. 

Second, assume that at time t one has uJt{E) = u{E) so that pt = paat and fit takes 
its maximum value, fit = 0. Since the algorithm updates the entropy, uJt{E) -^ LJt+i{E) = 



4 Given 1 = JdEptiE) ^ ^ dE e^°a{Pt{E)) ^ ^^^ ^^^ identity / (||) e-^(^) > exp {/ (||) f{E)]. 
^ The assumption 7 ^ 1 is valid for all our calculations since we use 7 ~ 10^^ — 10^^. 
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uJt{E) + 'jFs{E, Et), Afj,t must be negative. A simple calculation gives 

A/.. = -^^ f 1 - V2^-^] + 0(y) . (3.20) 



The size of this rather special step is parametrically smaller than the generic 0(7^/^) of the 
first example. Note that the exact density of states is not a "fixed point" of the algorithm, 
which may be surprising at first glance, but is in fact an essential feature of the algorithm. 
It ensures that the simulation explores all values of E in the desired range. 

To get a more precise measure of how Afit behaves, we calculate its expectation value 
averaged over an ensemble of simulations all starting with the same uJt{E). The result is 

(A/i,) = JdEtAfitMEt) (3.21) 

= -^ - / dEt pt{E,) log [1 - (1 - exp {-^Fs{E, Et))),] . (3.22) 

(Here the internal average is over E.) Using the identity log(l — x)~^ > x we find 

{Afit) > -^^ + JdEidE2 pt{E,) pt{E2) {1 -exp [-7^,(^1,^2)]} . (3.23) 

The kernel in the curly braces is positive semi-definite, but decreases rapidly towards zero 
for \Ei — E2\ ^ 6. It is also symmetric under Ei ■^^ E2. Consequently, the double integral 
on the r.h.s. of (13.231) provides a definition of a (smeared) inner product oi pt{E) with itself. 

2. Relation o/A/ij to pt{E) 

To make use of Eq. (13.231) we must evaluate the second term on the r.h.s. and relate 
it back to fif For that purpose we first use the kernel and define the following squared 
"distance" between two probability distributions (13.231) : 



\\Pa-Pb\\ = 



2_JdE, dE2 [pa{Ei) - p,{E^)] {1 - exp [-iFs{E^, E2)]} [paiE^) - p^iE^)] 



JidE,/AE) (dE^/AE) {1 - exp [-^FsiE,, E^)]} 

(3.24) 

The normalization is chosen so that ||Pflat|P = 1- Equation (I3.24p is a generalization of the 

standard Euclidean distance used in 10|]. In fact, if one takes 5 — > 0, the kernel becomes 

proportional to 6{Ei — E2), and one obtains (the continuous E version of) the Euclidean 

11 



For 6 > 0, the kernel gives different weights to different Fourier components (in E'-space) of 
{Pa{E) — ph{E)): wavelengths larger than 6 are included with full weight, with the weight 
decreasing to zero as the wavelength itself decreases to zero. As a result UV differences are 
filtered out. Indeed this kernel is a natural integration measure for our purposes because 
the WL algorithm only makes changes to ut which have wavelengths of 0{6) or longer. 

We can use the distance \\pt — Pflatll as an measure of the approach of pt to pgat- To 
evaluate this distance we need to calculate the integral in the denominator of Eq. (I3.24p 

J{dE2/AE) {1 - exp [-jFs{Ei, E2)]} = [1 - c]^5^/AE . (3.26) 

The constant c obeys < c < 1, and for small 7 is 

c=^ + 0(f). (3.27) 

It is independent of Ei up to boundary effects of 0{''y'^5/ AE). Ignoring these numerically 
very small effects, it is straightforward to show that 

lbt-Pflat||' = lbt||'-l. (3.28) 

Combining Eqs. ([SSSD, ^M and dSSHD, we find 

{^,.)> ''^^l-'\ \\n-P...\?-R')- (3.29) 

with R^ = c/(l - c) ~ 7/^8. 
From this it follows that 

• If \\pt — Pflatll > R then (A/ii) > and the simulation will, on average, move towards 
the desired point pt = Pfiat, at which uJt{E) = uj{E). 

• If \\pt — Pflatll < R then the lower bound on (A/it) is negative and the simulation can 
move both towards and away from pt = Pflat- 

Finally, note that when \{pt — Pflat)/Pflat| ^ 1, one can show from the definition of fit 
[Eq. (EUD] that 

1 ffdE\ ( pt{E) - p^^,{E)V 1,, ,,2 



The first approximate equahty assumes that the fluctuations of {pt{E) — pflat) are small 
(which is a good approximation at large t, as we will see shortly). The second approximate 
equality assumes that the corrections to the 5 — *> limit, Eq. (I3.25p . are small, and is thus 
only an order of magnitude approximation. 

12 



3. Behavior of pt as a function oft and estimating fluctuations 

Putting together the above ingredients, the following picture emerges. Consider the 
infinite dimensional space of probability distributions Pt{E), with distances defined by the 
Euclidean metric Eq. (13 .24^ . Let the origin be at pt{E) = psat, and denote the radial 
coordinate in this space, \\pt — Pflatll, by r. A crucial role is then played by a ball of radius 



i^y/y/S centered at the origin. The results above imply that ii pt(E) lies outside this 
ball, then the simulation will perform a directed random walk towards the ball, with steps in 
fit [and thus, from Eq. (13.301) . also in r^] of average size proportional to (r^ — i?^) x 'y6/AE. 
Since the steps get, on average, smaller as one approaches the ball, the approach to its 
surface is exponentially slowed. Individual steps, however, do not shrink to zero, so one will 
eventually end up inside the ball. Once inside, Eq. (I3.29P only gives a lower bound on {Afit), 
so we do not know its sign. The simulation may move throughout the ball, or it may cluster 
near the surface. Combining Eqs. (13.301) and (13.201) . one finds that the typical step size is 
Ar ^ 7(5 and thus much smaller than the size of the ball R ~ 7^'^. One possible behavior 
is illustrated in Fig. El 




FIG. 2: Pictorial representation of how the Monte-Carlo time history of the WL algorithm looks in 
Pt{E) space (see text). Left panel: Initial stage - convergence of pt{E) towards a 'ball' of radius R 
around pflat- The step-size gets smaller as the algorithm approaches the ball. Right panel: Second 
stage - fluctuations within the ball. If the fluctuations drive pt outside of the ball, it is driven back 
inside by the type of motion in the left panel. 
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It is clear from the foregoing that only in the second stage, when the simulation has 
reached the ball, can one use uJt{E) as an estimate of uj{E). We now calculate the size of 
the fluctuations in this estimate. This requires that we remove the overall uniform growth 
of ujt which occurs because of the addition of 'yFs{E, Et) with a uniform distribution of Et. 
To do so we write: 

UtiE)-uj{E) = C{t) + AuJtiE), with f dE AutiE) = . (3.31) 

The iJ-independent quantity C{t) is determined by the normalization condition on Aujf, 
and is a linear function of t with slope 6'J^/^T/AE. AuJt{E) contains the physically relevant 
fluctuations, since C{t) makes no contribution to observables. 



Once we are inside the ball we have \\pt — Pflatll ~ -R ~ \J^ /yS <^ 1. Our task is to use 
the definition of pt in Eq- (I3.10p to convert this into a result for the fluctuations in uot. To 
do so, we assume that once in the ball, the proximity oi pt{E) to Pfiat{E) occurs not just on 
average (as the smallness of | \pt — Pfiat 1 1 implies) but also for each E separately. Then we 
have that 

P'^^'^-P^^' ^ log (P^) = -Auj.iE) + O iAut{E)f . (3.32) 

Pflat V Pflat / 

The last step follows by expanding Zt in AuJt{E). Inserting the result (13.321) into Eq. (I3.24p 
we find the desired relation: 

II _ ||2 / dE^dE^ME^) - JUJE^)) (1 - exp [-^FsjE,, E^)]) {jUtjE^) - mjE^)) _ ^ ^ 

"^''* ^*" ^ JdE,dE,{l-exp[-^FsiE„E,)]) " ^"^""^ 

(3.33) 

Thus, once in the ball, the fluctuations in Uf are the same as those in pt. Such a "filtered" 

measure of fluctuations is sufficient, because the update of ut does not introduce UV noise. 



We conclude that Auj ^ R ^ J'-y/yS. This is the same parametric behavior as in the 
discrete WL algorithm [ld |. 

An important issue for the practical application of our variant of the WL algorithm is 
the detailed nature of the fluctuations u^E) — uJt{E). In particular, do they average to zero 
for each E once one is inside the ball? The previous analysis does not directly address 
this question. We consider it very plausible, however, that the answer is positive. This is 
because the algorithm is designed to smooth out nonuniformities in uj{E) ~uJt{E), although 
it does so with some "overshoot" which leads to the fluctuations. It would be interesting to 
extend the analysis of the algorithm to include such non-equilibrium effects. For the present, 
however, we assume that uj{E) — uJt{E) fluctuates symmetrically about zero for each E. 
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We end this subsection by estimating the parametric dependence of fluctuations in the 
histogram ht{E), at least in certain limits. The entropy is related to the histogram by a 
Gaussian transform, 

uJt{E) - uJo{E) =-i j dE' ht{E') e-(^-^')V<5^ ^ (3.34) 

where ujq is the initial guess. We will use a shorthand notation for this transform and its 
inverse: 

{u^t-ujo) =lQ{ht) , g''^{uJt-uJo) =jht. (3.35) 

Using Eq. f l3.3ip . we can write 

ootiE) - u^{E) = C{t) + {uj{E) - uooiE)) + ^ft{E) . (3.36) 



For large t, when pt is in the ball, ft = tlujij ^ fluctuates around zero with an amplitude 
of 0(1). Substituting Eq. flOHD into Eq. fl335D we obtain 

ht = ^^Q-\uj-uj,)h^G-\U)l^. (3.37) 

where we use that result that an inverse Gaussian transform of a constant is a constant. 

The subsequent analysis depends on the relative size of the second and third terms in 
(I3.37p . and thus on the accuracy of the initial guess io^iE). One extreme case is a poor 
guess, ciJo = 0. In this case the second term dominates over the third (at least for small 
enough 7) which means that if one evaluates the variance in /ij. 



bht = JjidE/AE) (ht{E) -ht)\ (3.38) 

~ht = f {dE/AE)ht{E). (3.39) 



it will have a t-independent contribution proportional to I/7. 

The other extreme is when one starts with a very good guess, ujq{E) ^ uj{E), so that 
the fluctuation term in (13.361) dominates over the second term on the r.h.s.. If so, then we 
expect a t-dependent contribution to 6ht that scales with 1/^/7. Presumably if the second 
and third terms compete, the scaling will lie somewhere between these two limiting cases. 
This appears to be the situation in many of our simulations. 

The result (13.341) tells us nothing, however, about the UV fluctuations in ht, since these 
are flltered out by the Gaussian transform. As discussed further below, we expect that this 
UV noise increases with t. In practice it is a small contribution in our simulations. 
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C. The effect of a non-equilibrated Wang-Landau simulation 

We close this section by stressing that the analysis just presented is predicated on letting 
the simulation equilibrate after an update to ujt is performed. While this equilibration is 
guaranteed if we let A^hit -^ oo, most of our runs were done with A^hit = 1- This means that 
the analysis above does not directly apply to such simulations — Pt{E) is changing after each 
update, 

ME) -. ptME) = Pt{E) + 0(7), (3.40) 

so exact equilibration cannot occur. Nonetheless, when 7 <^ 1 and pt{E) ^ pt+i{E), ap- 
proximate equilibration is possible. Thus we think it is plausible that the analysis just given 
remains applicable given 7 is small enough. We have checked this in practice by doing runs 
with iVhit ^ 1 and seeing that the results are unchanged within errors. An example is shown 
below. 

IV. IMPLEMENTING AND TUNING THE WANG-LANDAU ALGORITHM 

In this section we describe how we implement the WL algorithm in practice, how we use 
it to estimate uj{E) and derived quantities, and suggest criteria for tuning 7,5 and A^hit- 

That tuning is necessary is apparent from the analysis of the previous Section. Partic- 
ularly crucial is the tuning of 7, which involves a balance between two competing effects. 
On the one hand, 7 controls the speed with which the algorithm explores values of action 
density. If the simulation has spent some time in the vicinity of a particular value of E, 
then uJt will be increased in this region, and the update probability (13.51) will favor motion 
to other regions of E. The rate of build-up of Ut is proportional to 7, so the rate of motion 
through "i?-space" will increase with increasing 7. On the other hand, by reducing 7 one 
reduces the fluctuations in ut (since Au; oc ^/y), and correspondingly reduces statistical 
errors in quantities derived from u. 

A. Algorithm structure and tuning 7 

For our application we can restrict the range of E to [-Emm, -E'max] C [—1,1], since we 
are only interested in the transition region. The range should be large enough that the 
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errors in the quantities of interest due to this truncation are much smaller than those from 
statistics. The appropriate range in our case can be read off from the hysteresis curves of 
Fig. [H One must cover the transition region and add a conservative cushion on each side, 
and we typically use E G [0.1,0.7] (see below for all our parameter choices). Working with 
less than a third of the full range [—1,1] saves considerable computation time. 
Having made the choice of range, the algorithm proceeds in two stages. 

1. Initial stage: 

During this stage the simulation makes a directed random walk towards the "ball" in 
Pt space of radius R centered on the desired flat distribution. The algorithm explores the 
chosen range of E and transforms the starting guess ujq{E) into a reliable estimate of the 
actual entropy. 

The histogram ht{E) is a useful monitor of progress during this stage. At the beginning, 
it will build up non- uniformly because the guess for the entropy function is imperfect, but 
by the end the histogram should be growing uniformly in E. The variance of the histogram, 
6ht, will grow from its initial value of zero and then approximately saturate. This saturation 
marks the end of the initial stage. 

As discussed in the previous section, the value at which it saturates depends on the 
accuracy of the initial guess. We illustrate in Fig. [3] what happens with both a poor and a 
good guess. In the former case (data represented by [red] pluses) we start with no information 
on the entropy, i.e. ujq = 0. The histogram first increases for small E, where uj{E) is large. 
When uot{E) ~ uj{E) for these values of E, the WL random walk gradually starts exploring 
larger values of E which have lower uj{E). Eventually (not shown) the whole range is 
covered, and the histogram grows uniformly, while maintaining in its shape the "memory" 
of the initial ujq. This shape is the second term in Eq. (I3.37p . 

The case of a good guess is shown by the [green] crosses. Here we show only the histograms 
starting after 55 x 10^ updates, so as to avoid cluttering the figure. Earlier histograms are 
similarly horizontal. For these simulations the third (fluctuation) term in Eq. (13.371) may 
dominate over the second. 

The behavior of 6ht for the case of the poor guess is shown in Fig. HI The stage of rapid 
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FIG. 3: Comparison of histograms ht{E) in the initial stage with wq = (red pluses) and the 
data accumulation stage for a good initial guess for ujq (green crosses). Results using WLR for 
5C/(20) with 7 = 10~^, 5 = 0.005. The values of t that correspond to the bottom six histograms 
t = [5, 10, 15, 20, 25, 30] x lO'* full updates of the model (corresponding to values of 25-150 on the 
"MC-time" axis of the next figure). The top four histograms were obtained after [50, 55, 60, 65] x 10^ 
full updates. The definition of a "full update" in given in Appendix lAl 

growth ends when 5ht saturates to a nearly constant function of t.^ There is a small but 
noticeable residual growth in 5ht which is due, we think, to UV fluctuations in the histogram. 
As discussed in Sec. IIIIB[ these fluctuations are not suppressed by the WL algorithm, and 
we expect them to be Gaussian with a contribution to 5ht growing like y/ht- 

Once one has obtained a good estimate of uj{E) for one value of A^, one can scale it 
with A"dof oc A^^ to use as a guess for a different value of A^, or reuse it for the same A^ 
with a different 7, 5 etc.. With a good guess the initial stage is shorter,^ and, according 



^ As noted in the previous section, we expect the amplitude at saturation to scale with I/7, although we 

have jmt checked this in this case. 
^ Ref. [18| reports that introducing an update to Wt (E) which is applied simultaneously to all values of E 

can also reduce the computational cost of this initial stage by an order of magnitude. We did not test 

this extensively. 
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FIG. 4: The average histogram variance, 6ht, for the loq = data presented in Fig. [3l 



to Section Till B 31 the value of 6ht at saturation should scale more like 1/^^ than like I/7. 
An example of this situation is shown in Fig. [5] where it appears that most, if not all, of 
the data is in the "saturation regime" , although it is hard to pinpoint exactly the beginning 
of this regime because of the fluctuations. Note that the 7 = 10^'* data correspond to the 
"good guess" histograms in Fig. [3] above. The figure shows clearly that the saturated 6ht 
grows with decreasing 7. The saturated values are approximately 5 x 10~^, 8 x 10^'^ and 
2.5 X 10"^, which are roughly consistent with the expected 7-scaling. 

Finally we note that we found it useful to experiment during this initial stage with 
values for 7, and determine a lower bound such that the range [-Emiru -E'max] can be explored 
repeatedly with the available computational resources. 



2. Data accumulation stage: 



The simulation is now fluctuating around the actual uj{E) (it is in the ball - see Sec- 
tion [TTIHE]). We propose that one perform A^meas measurements of uJt{E) separated by a 
fixed number updates. In our case of a first-order transition, the gap between measurements 
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FIG. 5: The standard deviation in the average histogram, 5ht, versus MC-tinie, for SU (20) and 
7 = 10~^, 10~^, 10~^ (shown by [red] pluses, [green] crosses and [blue] stars, respectively). This 
data uses a good initial guess ujo{E) and so has a very short initial stage. Thus Sht is mostly or 
completely saturated. Results are for 6 = 0.005 and A^hit = 1- The corresponding plot for A^^it = 20 
and 7 = 10^^ is similar. 

should ideally include, on average, several tunneling events.^ The average of these measure- 
ments provides an estimate for uj{E) (up to an overall irrelevant constant). The deviation 



of this estimate from the true entropy will then scale as J^y/N^eas- 

For derived quantities such as the specific heat (11.41) . we propose calculating the errors 
using the jack-knife or similar method applied to the set of A^meas measurements of uJt{E). 
This has the advantage of automatically taking into account correlations in cases where 
we have two few tunneling events between measurements. We expect the errors in derived 



quantities to also scale as A/7/A^meas- 

We show an example of the behavior of E during this data accumulation stage in Fig. El 
The runs are the same as for Fig. [5], except that we show only the smallest and largest values 



We define a tunneling event as motion from E'min to i^max and back again. This is a conservative definition 
since a tunneling can occur without motion all the way to the edges of the range. 
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of 7 for the sake of clarity. Tunneling is clearly seen,^ with a frequency that decreases with 



gamma=1e-4 
gamma=1e-6 




20000 



40000 60000 

(MC-time) 



80000 



100000 



FIG. 6: The MC-time history of the action density E for SU{20) with 7 = 10 ^ (red pluses) and 
7 = 10~® (green crosses). In both cases 5 = 0.005 and A^^it = 1- 

decreasing 7. 

This time history allows one to understand the large fluctuations in 6ht seen for small 7 in 
Fig. [5l Before a tunneling event takes place, the histogram grows only for low (high) values 
of E, and consequently 6ht grows. After the tunneling, the previously unvisited high (low) 
range of E is explored and 6ht drops. Thus a tunneling event is manifest in the MC-time 
history of 6ht as a peak. Indeed we have confirmed that the peaks in Fig. O peaks coincide 
with tunneling events seen in the time histories of E shown in Fig. [HI 

The data accumulation stage can also be used to further tune the value of 7. Decreasing 
7 reduces errors, but also, for fixed computer time, decreases tunneling rates and thus A^meas- 
One should choose 7 to optimize the error in the derived quantities of most interest, which. 



^ It is important to keep in mind that these tunnehng histories inevitably look different from those in 
canonical simulations running at or near the transition coupling. In the latter the fluctuations in each 
phase are over a very limited range of E, while in the WL algorithm the simulation must, by construction, 
move out to the boundaries of the i?-range so that all values of E are equally populated. 
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as we have noted, are expected to scale like y 7/A^meas- 

One must also determine how to scale an optimized 7 between different values of A^. 
One criterion is to maintain the same tunneling rate. Since uj{E) is an extensive quantity 
scaling like A^dof , we expect that 7 must be scaled similarly if it is to lead to a similar rate of 
motion through ii^-space, and in particular to the same tunneling rate. We have found that 
such a scaling rule works reasonably well in practice. As an example, we show in Fig. [7|the 
comparison of the time-history for SU{20) with 7 = 10~^ and SU{50) with 7 = 3 x 10^^. 
The ratio of the 7's is 3 while the ratio in the number of degrees of freedom is (50/20)^ ^ 6. 
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FIG. 7: The MC-tiine history of the action density E using the WL algorithm. Red pluses: 5'C/(20) 
with 7 = 10"^. Green crosses: 5'[/(50) with 7 = 3 x 10^^. In both cases b = 0.005 and N\^\l=l. 

Thus they are of the same order of magnitude and we expect the tunneling rate for SU{bQ) 
to be similar to that for SU{2{]). As the Figure shows this is approximately true. This 
should be contrasted with standard MC simulations in which the tunneling rate for SU{bQ) 
is exponentially smaller, reduced in the present case by about a factor of 500 compared to 
5'f/(20). This is a striking example of the efficacy of the WL algorithm at overcoming the 
suppression of tunneling events. 
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B. Tuning 6 

The parameter S determines the width of the smearing function 'yFs{E, Ef) that is added 
to uJt{E). Since the area under 7^5 is proportional to 7 x 5, it is this product that determines 
how fast the simulation moves through ii^-space. Indeed this product enters in the bound on 
the steps in A/i^, Eq. fl3.29p . By contrast, the size of fluctuations in cuj, and thus in derived 
quantities, depends only on 7 and not on 5 (since R^ ~ 7/v^)- In light of this one wants 
to make 5 as large as possible before tuning 7. 

The upper limit on 5 is set by different considerations. As 5 increases, the resolution 
with which one obtains oj{E) is decreased, and it must not approach the width of the region 
which makes the important contributions to observables like the specific heat. Thus we 
propose that one must keep 5 ^ cr, with a the width in E of each branch of the canonical 
distribution Pc{E), 

Pc(E) ~ exp {uj{E) + 12N\e) , (4.1) 

in the vicinity of the transition coupling. This guarantees that the integral in Eq. (13. 4p can 
be evaluated accurately. We note that a can be estimated with a standard MC simulation. 
In practice we choose a value, 6 = 0.005, which clearly satisfies 5 <^ a (see Fig. [9] below) 
and do not undertake extensive investigations of the sensitivity to this choice. 

C. Tuning iVhit 

Finally, we discuss the tuning of A^hit, which we recall is the number of updates one does 
with a given uJt{E) before updating to uJt^i{E). To reduce computational effort, one wants 
to choose Ahit as small as possible. This, however, can introduce a sizable systematic error 
since the convergence of the WL algorithm is formally only guaranteed if N^^^t -^ 00. The 
lower limit Ahit depends on 7. This is because for large values of 7, the update of Eq. (13. 6p 
is very abrupt and the system will require more hits to equilibrate into the new distribution 
Pt+i{,E). Correspondingly, for the very small values of 7 that we use, the system may be 
able to equilibrate even with N\^i^ = 1. In fact we use this value for most of our runs. 

Lacking a firm theoretical foundation, it is clearly important to do numerical checks of 
the dependence on iVhit. What we find (as will be shown below) is that Ahit = 1 is acceptable 
(i.e. gives the same results as with larger values) if 7 is small enough. A possible explanation 
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for this is that the number of effective hits between updates of the entropy is larger than 
A^hit- This is because the typical change in E in an individual update is, in our simulations, 
an order of magnitude smaller than 6. Thus the system performs a random walk "inside" 
the Gaussian Fs. So, in an approximate sense, the simulations are being done with effective 
values of A^hit and 7 that are two orders of magnitude larger than the assigned values. 

We conclude this section by stressing that this systematic error should be estimated 
explicitly. This can be done by comparing results for derived quantities to those obtained 
using standard MC simulations at values of b away from the transition (so that the latter are 
reliable), and/or by checking the sensitivity of the results obtained with WLR to changes in 
7 and A^hit- Details of such checks will be described at the end of the next section. 

V. RESULTS 

We have undertaken long runs with A^ = 20 — 50 using the parameters listed in Tables [I]- 
HVl (The parameter we denote by A^bin is discussed in Appendix El) In all cases the 
measurements were separated by 10000 full updates of the model (for a definition of a 
full update see Appendix |X]) , except for the data in the last two rows of Table [H where the 
separation was by 100000 full updates. Thus for each choice of A^ and algorithm parameters, 
we perform in total (1 — 3) x 10^ full updates. 

To present our results we first define the logarithm of the canonical probability function, 
calculated at the transition coupling bt: 

uJ{E)=uj{E) + 12 NHt{N)E. (5.1) 

We write bt{N) in Eq. (15. ip to emphasize that the transition coupling bt depends on A^. 
Presenting lJ{E) and not 00 {E) makes the A^ dependence more apparent. We calculate u{E) 
by averaging over the measurements. In Fig. Owe present uJ{E)/N'^ for the gauge groups 
we studied. The values of bt{N) used to generate this figure appear in Table |VT] and we 
discuss how we obtained them below. In Fig. [9] we show the canonical probability function 
itself, i.e. exp{uJ{E)). We do not show statistical errors in either figure since the meaning of 
such an error for both uj{E) or its exponent is nontrivial: only differences of uJ{E) and ratios 
of exp(a;(£')) have physical meaning. Meaningful errors can be computed using a simple 
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FIG. 8: Our most reliable estimates of lo{E)/N'^. All results were obtained with A^hit = 1 and 6 = 
0.005, except for iV = 50 where 6 = 0.0025. The values of 7 were 10"^, 1.4x 10"^ 2.5x 10"^ 3x 10^^ 
for A^ = 20, 30, 40, 50 respectively. For presentation purposes we shift the maximum of uJ{E) to 
zero for each A^. Note that a smaller -Bmax was used for A^ = 50. 

scheme of error propagation but this is not necessary for our purposes here.^° An indication 
of the size of the uncertainty is given, however, from the "wiggles" in Fig. [9l Note that these 
are much larger than those in Fig. [8], because of the exponential enhancement. 

The expected double-peak structure is clearly seen, yet the WL algorithm has done its 
job by providing the density of states in the intermediate regime. It is noteworthy that the 
dip in uJ/N"^ grows with increasing A^. This is contrary to the usual behavior in field theories 
where the dip in this normalized quantity decreases as the Ajof increases. 



^° For example, one can estimate the statistical error in the ratio of exp{uj{E)) between adjacent values of 
E and propagate it in a stochastic manner to find the error in exp(ZU(_Ei) — lJ(E2)) for a finite difference 

iSi -Eo\. 
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FIG. 9: Our most reliable estimates of exp{uj{E)). The parameters are as in Fig. El 

Using our estimates of uj{E) we can calculate the average action density S and the 
corresponding specific heat C, 



m 

C{b) 
Z{b) 



Z-\b) / dE exp [uj(E) + 12N%E\ E, 
Z-\b) f dE exp [iu{E) + 12N%e] {E - £{b)y 
dE exp \u{E) + 12N%e] , 



(5.2) 
(5.3) 
(5.4) 



as a function of inverse 't Hooft coupling b, at least for the range of b where S lies well within 
our range [-Emin, -^max]- The results of doing so are shown in Figs. [TU]and[TTl All statistical 
errors are obtained using the jackknife method, dropping single measurements of u in turn 
from the average. 

We define the transition coupling bt to be the location of the peak in C{b) and give the 
resulting values of bt{N) in Tables Hl- IIVI The WL algorithm is clearly able to determine bt 
with high accuracy (0.1-0.2%). We note that error in bf is roughly constant as A^ increases, 
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FIG. 10: Final results for £{b) obtained using WLR. Error bars are shown at selected values of b, 
and are highly correlated. 

as long as we scale 7 so that the tunneling rate stays approximately the same (as discussed 
in the previous section). Since the number of full updates is approximately the same for all 
A^, this means that the computational effort is growing proportional to A^^. This is a much 
milder dependence than the exponential growth required for canonical simulations. 

All results for bt for a given A^ are consistent, despite the use of different values of the 
parameters of the algorithm. As a further check we have used the Ferrenberg-Swendsen 
multi-histogram reweighting method, which works well for A^ = 20, 30, but fails for A^ > 40. 
The values of bt obtained with FSR are given in Table El and agree within the very small 
errors with those from the WL algorithm. 

The situation is different for the results for the peak value of the specific heat, Ct = C{b = 
bt). Although results in Tab. [T] for A^ = 20 are consistent, those in Tabs. HTl and UTTl for 
A^ = 30 and 40 are not. In addition, we find discrepancies with results from canonical MC 
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FIG. 11: Final results for C{b) obtained using WLR. 

simulations. These are exemplified by Fig. [121 where we present the estimates oi S{b > 0.305) 
from WLR for SU{30) together with the values obtained from direct MC simulations. What 
we find is that we obtain agreement with FSR and/or canonical MC results only if either 7 
is small enough or A^hjt is large. This is presumably the realization of the systematic error 
discussed in Section [IV] and illustrates the importance of having results at more than one 
value of 7 or iVhit. 
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total number of 
tunnelings 



A^bin 



N„ 



3.222(128) X 10-3 
3.336(55) X 10-3 
3.399(55) X 10-3 
3.260(83) X 10-3 
3.402(35) X 10-3 



10-^ 


1000 


108 


~ 140 


0.29585(45) 


10"^ 


1000* 


no 


~400 


0.29597(21) 


10"^ 


4000 


100 


~ 120 


0.29544(37) 


10-5 


1000 


18 


~80 


0.29509(37) 


10-6 


1000 


20 


~ 40 


0.29585(13) 



TABLE I: Results using WLR for SU{20), using the range E G [0.1, 0.7], and setting 5 = 0.005 and 
-^hit = 1- The row denoted by a star was obtained by adding explicit permutations to the update 
of the SU(N) matrices, which were found to be accepted 20% of the time. For further details on 

n 

the importance of permutations we refer to Ref . [l^j • 



iVhit iVbi, 



Nr, 



total number 
of tunnelings 



bt 



Ct 



5.974(50) X 10-3 
5.051(260) X 10-3 
6.276(120) X 10-3 
6.395(230) X 10-3 
6.453(120) X 10-3 



2.25 X lO-'' 

2.25 X lO-'' 

2.25 X 10-^ 

1.4 X 10-5 

1.4 X 10-^ 



1 
1 
1740 
1 
1 



1500 
50000 
1500 
1500 
1500 



108 
100 
114 
102 
110 



~ 140 


0.30557(55) 


~ 180 


0.30652(80) 


~20 


0.30539(35) 


~60 


0.30569(17) 


-20 


0.30551(30) 



TABLE II: Results from WLR for SU{30), using the range E G [0.1,0.7]. All calculations were 
done with 6 = 0.005 except for that presented in the last row, for which 5 = 0.0008. 



VI. SUMMARY 



In this paper we present an implementation of a variant of the Wang-Landau reweighting 
algorithm in the context of SU{N) lattice gauge theories. ^^ 

This algorithm was introduced in the field of statistical mechanics to calculate the density 
of states of discrete spin systems. We use a generalization of the original algorithm to systems 
with continuous degrees of freedom and apply it to a matrix model that is obtained by 
quenched reduction from four- dimensional SU{N) lattice gauge theory. This matrix model 



^^ The original WL algorithm was iniplimented for C/(l) gauge theory in Ref. [5|, and used to provide an 
input weighting function for a multicanonical simulation. 
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total number of 

7 ^meas h Ct 

tunnelings 

4 X 10"'' 151 ~ 850 0.30965(75) 6.67(30) x lO^^ 

2.5 X 10^5 355 ^ 75 0.30968(20) 8.24(10) x lO^^ 

TABLE III: Results from WLR for SU{40), using the range E £ [0.1,0.7], and obtained with 
6 = 0.005, iVbin = 1875, and iVhit = 1- 

total number of 

7 ^''mcas bt Ct 

tunnelings 

3x10-5 108 ~45 0.31119(19) 9.02(12) x lO^^ 

TABLE IV: Results from the WLR for SU{50), using the range E £ [0.1,0.6], and obtained with 
6 = 0.0025, A^bin = 2100, and Ahit = 1. 

consists of four SU{N) matrices with interactions governed by the 't Hooft couphng A, and 
has a first-order strong-to-weak couphng phase transition in its large- iV hmit at A = A^. An 
accurate measurement of Af at A^ = oo is what we aimed to achieve using WLR. 

Our variant of the WL algorithm does not extrapolate the flucturations in the Boltzman 
weights towards zero, but rather retains these fluctuations at a small, non-zero value, in order 
to maintain tunneling at a first-order transition. Assuming these fiuctuations are symmetric 
around zero, we can systematically estimate the error in the density of states and in derived 
quantities such as the specific heat. We have studied the systematic errors associated with 
chosing the various parameters of the algorithm. Our most reliable WL estimates of At for 
gauge groups with A^ = 20,30,40,50 are summarized in Table |VT] and plotted in Fig. [T3] 
versus l/N"^. We fit our data to the form 

N Reweighting using bt Ct 

20 40 histograms 0.295980(48) 3.32(3) x lO^^ 

30 17 histograms 0.30551(30) 6.45(5) x lO^^ 

TABLE V: Results from FSR multi-histogram reweighting. For N = 20(30) each histogram con- 
tains an average of 10^ (5 x 10^) measurements, separated by 5 full model updates from each 
other. 
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FIG. 12: Comparison of WLR results for the average action density for SU (30), with direct mea- 
surements from standard MC simulations. The systematic error discussed in Section fill CI is clearly 
seen. 



N 


20 


30 


40 


50 


bt{N) = (Xt)],' 


0.29544(37) 


0.30569(17) 


0.30968(20) 


0.31121(19) 



TABLE VI: A summary of our most reliable WL results for bt{N) = (At) 



-1 

N ■ 



(A 



t}N 






(6.1) 



and present the results of these fits in Table IVIII We also plot the result of the linear fit in 
1/A^^ (i.e. the fit with A = whose results are presented in the first row of Table IVIip in 
Fig.IH 

We find that in the large- A^ limit (Xt)^ = 0.3142(2) - 0.3148(10), depending on the way 
we fit. These results are many standard deviations away from the value of (A)b^i]^ ~ 0.36 
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FIG. 13: The strong-to-weak transition coupling, bt = l/Aj, plotted versus 1/N'^. [Red] squares 
show our resuhs using the Wang-Landau algorithm from Table IVII The sohd blue curve is the 
linear fit described in the text (with parameters listed in the first row of Table IVTIl) . 



Type of fit 



(A 



t)oo 



B 



xVd.oI. 



A = 0,B y^O 
A^O,B ^0 



0.3142(2) 
0.3148(10) 



-0.037(65) 



-7.59(18) 
-7.06(97) 



1.45/2 
1.1/1 



TABLE VII: The fit parameters {\t)ooj ^i ^^'^ ^^ obtained from fitting the Wang-Landau data 
in Table IVTl to the form Eq. (j6.1|) . The first row shows results from a fit linear in l/N"^ (i.e. with 
A = 0). 



where the strongly first order 'bulk' transition takes place in four-dimensional SU{oo) lattice 
gauge theories. This discrepancy is one of several pieces of evidence adduced in Ref. [12 1 
for the breakdown of large- A^ quenched reduction in four- dimensional SU{N) lattice gauge 
theories, and we refer the reader to that paper for further discussion. Such a discrepancy 
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was not seen in past explorations of the matrix model partly because the phase transition 
is so strong that it is very hard to measure its transition coupling by conventional means. 
The Wang-Landau algorithm allowed us to solve this problem and to determine that there 
is a discrepancy. We conclude that the Wang-Landau algorithm can be a useful and feasible 
way to study SU{N) lattice gauge theories. 
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APPENDIX A: ALGORITHMS FOR SIMULATING QUENCHED, REDUCED 
SU{N) LATTICE GAUGE THEORY 

In our standard (non Wang-Landau) MC simulations of the model we used several dif- 
ferent algorithms to update the matrices V^. We use a standard Metropolis algorithm (M), 
a "hybrid" heat-bath -|- Metropolis (HM), and a full heat-bath (HB) — the latter including 
different types of over-relaxations. For the M and HM algorithms we generate a set of 
random SU{2) matrices at the beginning of the run and keep them in memory. For each 
matrix in this list we add to the list its inverse. The M algorithm is completely standard. 
We randomly choose an SU{2) matrix u from the list, extend it to an SU{N) matrix by 
adding I's along the diagonal, update V^ — > m V^ and accept this proposed update with the 
usual Metropolis probability. This is repeated five times for equilibration. This process is 



then repeated, following Cabibbo and Marinari 1^, for each of the [A^(A^ — l)/2] SU{2) 
subgroups of SU{N) in turn, and for each V^ in turn. 

We now describe the other algorithms, which are less standard. 

1. Hybrid Heat-bath Algorithm 



Here we use the prescription suggested in Ref. [20| to make the action linear in the link 
matrices U^. This requires a Hubbard-Stratonovich Gaussian field Q^u for each plaquette. 
It results in an effective action Acs{U^, Q^v) that is quadratic in Q^y and linear in U^ — and 
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thus quadratic in V^. We then update one of the V^ as follows: 

1. First update the matrices Qf^^u- This update is trivial since Q has a (shifted) Gaussian 
distribution. 

2. Update V^ as in the M algorithm using all SU{2) subgroups but now with the action 

This is repeated in turn for each of the links. 

2. Heatbath and over-relaxation algorithms 

The heatbath algorithm requires the use of two auxiliary fields in order to obtain an action 



which is linear in the V^. It is not quite as simple as applying the approach of Ref. 20 1 
twice, and so we give some details. 
We begin by recalling the action 

A = 2NY.ReTT [u^U^Uluf) , (Al) 

where U^ = V^A^V^^ and {A^)ab = (^a6exp(ip^). We next define two sets of unitary matrices 
V = V;tK = Al^ , B,, ^ A,,KA\, ^ Bl , (/i ^ z/) , (A2) 

in terms of which the action can be written as 



A = 2Ar^ ReTr (VA^At^A^VAl^^A^) , (A3) 

= 2Nj2Re Tr {b.^AIbI^A,) . (A4) 

For each plaquette, i.e. for each fi < u we introduce auxiliary complex fields Qfj^i, and P^,y, 
with Boltzmann weights 

exp[-bN Tr (Q.^QD - bN Tr (P.^Pl)] . (A5) 

These are then shifted as follows: 

Qm- = Q^u + {b^,,AI} , (A6) 

Q^, = {Q^,, A^} = {q,,,, A^} + 2B^, + A^B^^Al + A^B^^A^ , (A7) 

B^u = r^y + A^yAy + Q i^^y A^y . (Ao) 
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The staple-like quantity X^y can then be calculated: 

X,y = KlPl + PlQl,\ Xy, = Xl, (/i<z/) (A9) 

Finally, the action can be written in a form suitable for a heatbath or overrelaxed update: 

^' = A - iV ^ [Tr {Q,.Ql) + Tr {P.^Pl,) + const.] (AlO) 

IJ.<V 

= nY^Tv {v,x,,v}) 

-N Y: [Tr iQ,.QU) + Tr (Q^Qf^J) + Tr (P^^Pp] . (All) 

fJ,<U 

In summary, to evaluate an observable that depends only on the gauge fields one can use 

iO{U)) = \j\{DV,e'^0{U) (A12) 

= 4; / n ^V; n DQ,,DQlDP,yDPl e'^' 0{U) (A13) 

= 4; /n^^M n DQ,yDQlDP,yDPle'^'0{U), (A14) 

where 

Z' = / n ^^M n DQ^,DQ\yDP,,DPl e'^ . (A15) 

^i. ijl<u 

The form ( 1A14I) . together with the expression for A' in eq. (lAllI) . shows how the auxiliary 
fields decouple the y's. To get the correctly distributed y's and P's, one can update V^ 
using the "staple" part of the action: 

Astapie(V;) = 2Ar ^ Re Tr {V^X^X) , (A16) 

where we stress that in this case the sum is now only over v. This form is suitable for a 
heat-bath update, which we implement in each SU{2) subgroup in turn. 

To generate the correct distribution of the P's is straightforward. Given the Q's and V^s, 
one can generate P's with the Gaussian measure and make the shifts given in eq. flASp . This 
leads to the correct linear and quadratic terms in P^^ in the action of eq. (lAlip . 



To update the Q's one must be more careful. Simply generating Q's with Gaussian 
measure and using the shift of eq. (1A6I) leads to the wrong distribution: neither the quadratic 
or the linear terms in eq. ( JAllI) are reproduced. Instead, one should "complete the square" 
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using the terms that are present in eq. (jAlip . To do so requires that one first generate the 



Q's using the Gaussian measure, but then shifts and rescales as follows: 

{Q^.u)ab = aab{Qf,,)ab + albii^uPl^, Ai})ab , (A17) 

«afe = , , ^=- (A18) 

^3 + 2cos(p»-p^) 

Thus the structure of the algorithm is as follows. One begins with an initial choice of V^'s 
and Q's. Then one can update all the P's, update all the Q's, and finally update the V^'s 
(updating all directions for given Q's and P's). To return to the beginning of the loop one 
needs to store not only the V's but also the Q's. One could also interchange the ordering 
and roles of the P's and Q's. 

Once one has an effective action that is linear in the U{N) matrices V^ the way is open 
for over-relaxation algorithms. We have thus implemented both an over-relaxation in all the 
SU{2) subgroups of SU{N) as well as a full SU{N) over-relaxation of the type described in 



2l|. 



3. Update scheme in the Wang-Landau algorithm 

In the WL algorithm we need to propose changes to the V^. This we do as in the 
Metropolis and HM algorithms, i.e. one SU{2) subgroup at a time. Such an update is 
what we refer to as a "hit", so if A^hit = 1 we change uJt{E) after each individual SU{2) 
multiplication. We also checked that updating uj{E) in between full SU{N) updates (for all 
four V^) gives similar results — this gives Ahit = 2 x A^(A^ ~ !)• Iii either case, we call a "full 
update" the update of all SU{2) subgroups for all four links. 

Finally, we have considered an extra type of update which permutes the angles p" ^^ p'' 
for randomly chosen pairs of indices a and b. This was motivated by the importance of 



permutations in this quenched-reduced model 



121. 



APPENDIX B: MORE PRACTICAL ISSUES 

In this section we list several practical issues relevant to the implementation of WLR. 
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1. A initial guess for ujt=o{E) 

We suggest performing the first implementation of the WLR with a relatively low value 
of A^dof = ^dof- This run can begin with a 'blind' initial guess of uJo{E; N^Ji) = 0. We then 
found it useful to appropriately scale the best estimate of uj{E, N^Ji) with A^dof , so as to use 
it as a good initial guess for A^dof > ^dof- Since lu{E) is an extensive quantity this means 
setting 

u^t=oiE; AtW) = -g co^^ooiE; AT^). (Bl) 

^dof 

In our case, with A^dof ~ A^^, this corresponds to uJt=o{E; Nf) = f^j uJt=oo{E; Nq). The 
generalization for a field theory in a finite lattice volume is obvious. We found that this 
procedure shortens the initial stage of WLR considerably. 

2. Boundary effects 

As we mention in Section [IV] it is useful to use WLR in a subset of the full range of E. 
This is sufficient if at the values of b of interest, the average action density S is localized far 
from the regime's boundaries. 

This modification complicates the theoretical analysis of Sec. IIIIBI in a way we only 
partially addressed. 

The presence of the boundaries raises two practical questions 

1. What do we do when the update i^oid -^ -E'new results in a value E-^ew which is outside 
of the region [-Emin, -En-—^ "^ 



^max 



2. What do we do when we update cuj -^ Ut+i and the update function Es{E, Et) extends 
outside the desired region? 



In this work we generalize the proposals of Ref. [22|. The answer to the first question is 
that we reject -Enew and thus perforce stay inside the desired region. This means setting 
ii^new = -Eoid and updating u;j(ii^new) as in a regular update. This is the standard approach, 
which one can understand as follows. If one were simulating the full range of E then every 
time one left the range [-Emin, -Emax] one would eventually return. We are just dropping the 
MC-time history of the WL algorithm for which E was outside the desired range. Since at 
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time t the WL algorithm updates uJt{E) only around Ef, performing WLR in this way gives 
an estimate for uj{E) which is correct away from the boundaries E = -Emin,max- 

Our answer to the second question is to "reflect" the part of Es{E, Et) which lies outside 
of the desired range back into the range. The precise definition of the reflection is as follows. 
For all E (even those outside [-Emin, -Emax]) perform 

uj{E') -^ uj{E') + ^F{E, Eom) , (B2) 

with 

E for E G [-Emiiij-E'max], 

E' = I 2E^^ -E ioT E< E^i„, (B3) 

2-E'max — E ioT E > -Emax- 

Here we assume that E' always obeys E' G [-Emim -S'max]- This is valid as long as the interval 
size (-Emax — -S'min) IS larger than the average change | i^new — -S'oid I ; which is very well satisfied 
in practice. ^^ 

The reflected update has two important properties. First, it maintains the result that 
the area J dE Fs{E,Et) is independent of Et. Second, for our Gaussian choice of Fs, the 
definition remains symmetric: Fs{E,Et) = Fs{Et,E). 

We find that if one does not perform the updates that corresponds to last two rows 
in Eq. flB3p . then the WLR fails to converge, and effectively overestimates uj{E) near the 
boundaries. This is easy to understand: if we do not reflect the contribution of the Gaussian 
in Eq. ( ]B2[) then effectively the update to uj{E) close to the boundary is smaller than it 
would be in the absence of the boundary. ^'^ Thus the force that drives one away from the 
boundary is too weak and one spends too much time updating uJt{E) near the boundary. 

3. Storing the functions uJt{E) and ht{E) 

Despite the continuous fashion in which we implement the WL algorithm, one still needs 
to store the functions uJt{E) and ht{E) in memory, as well as to update them. We do this 



^^ We note in passing that once Fg drops below a certain size (10~^ was our choice) we set it to zero. This 
saves calculation time since one needs to update lo{E) only for a subset of E, and also avoids the problem 
of double and higher-order reflections. 

^^ To demonstrate this one can take the limit of 5 — > and reproduce the case discussed in [2^] ■ We also note 
that maintaining the area under the F^ plays an important role in the theoretical analysis of Sec. IIII Bl 
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by dividing the range [-Emin, -^max] into A^bin bins. The criterion that determines the bin size 
6E = (-Emax — -Emin)/^bin IS the Same as that for 6 (see Sec. lIVBp . In order for the error in 



the numerical evaluation of the integral Eq. (13.41) by binning remain small, one must have 

6E<^a,, (B4) 

where a is the width in E of the canonical distribution function Pc{E) of Eq. (14. ip . Assuming 
that uj{E)/N'^ is quadratic about the peak and has a good N -^ oo limit (which appears to 
hold for the "outside" branches of the peaks — see Fig. [HI), then one can show that a oc 1/A^. 
Thus one must increase the number of bins as A^bin oc A^, as we have done (see Tables [I1 HV|) . 
Another criterion one might consider using is to enforce a relationship between 6E and 
the average step size. Naively one might think that the bins should be smaller than the 
average step size, so that discretization effects do not hinder the motion in i^-space. This 
would be an onerous requirement, since our step size, which is ~ (1 — 5) x 10^^, would 
require significantly more bins than we use in most simulations. In fact, it turns out that 
the acceptance rate, and thus the motion through ii^-space is almost independent of the bin 
size. We have seen this numerically for SU (30), where we have one simulation with 50000 
bins. But one can also understand this analytically if the step size is much smaller than a, 
which is the case for our simulations. We do not present the derivation, but the essential 
point is that with bins much larger than the step size the smaller acceptance when jumping 
between bins (when the step is to lower E) is exactly counterbalanced by the free motion 
(without rejection) within the bins. 



[1] A. M. Ferrenberg and R. H. Swendsen, Phys. Rev. Lett. 61 (1988) 2635. Phys. Rev. Lett. 63, 

1195 (1989). 
[2] B. A. Berg, "Introduction to Markov Chain Monte Carlo Simulations and their Statistical 



Analysis," World Scientific 2004. arXiv:cond-mat/0410490 



[3] B. A. Berg and T. Neuhaus, Phys. Rev. Lett. 68 (1992) 9 |arXiv:hep-lat/9202004] . 



[4] M. Campostrini, Nucl. Phys. Proc. Suppl. 73, 724 (1999) |arXiv:hep-lat/ 9809072 j . 

[5] B. A. Berg and A. Bazavov, Phys. Rev. D 74, 094502 (2006) [arXiv:hep-lat/06050 19 1 

[6] F. Wang and D. P. Landau, Phys. Rev. Lett. 86 (2001) 2050. 

[7] G. Bhanot, K. Bitar and R. Salvador, Phys. Lett. B 188, 246 (1987). 

39 



[8] Z. Fodor, S. D. Katz and C. Schmidt, JHEP 0703, 121 (2007) |arXiv:hep-lat/0701022] . 
[9] A. Denbleyker, D. Du, Y. Liu, Y. Meurice and A. Velvtskv. [ arXiv:0807.0185] [hep-lat]. 



[10] Chenggang Zhou, R. N. Bhatt, Phys. Rev. E 72, 025701(R) (2005), [a?Xiv:cond-mat/0306711 
[11] G. Bhanot, U. M. Heher and H. Neuberger, Phys. Lett. B 115, 237 (1982). A. A. Migdal, 

Phys. Lett. B 116, 425 (1982). D. J. Gross and Y. Kitazawa, Nucl. Phys. B 206, 440 (1982). 

G. Parisi, Phys. Lett. B 112, 463 (1982). G. Parisi and Y. C. Zhang, Nucl. Phys. B 216, 408 

(1983). G. Parisi and Y. C. Zhang, Phys. Lett. B 114 (1982) 319. 
[12] B. Bringoltz and S. R. Sharoe. [a?Xiv:0805.2146l [hep-lat]. 
[13] S. Trebst, D. A. Huse, and M. Troyer, Phys. Rev. E 70, 046701 (2004). 
[14] P. Dayal, S. Trebst, S. Wessel, D. Wiirt, M. Troyer, S. Sabhapandit, and S. N. Coppersmith, 

Phys. Rev. Lett. 92, 097201 (2004). 
[15] M. Okawa, Phys. Rev. Lett. 49, 705 (1982). 
[16] G. Brown and T. C. Schulthess, J. Appl. Phys. 97, 10E303 (2005), S. Sinha and S. K. Roy, 



larXiv:0711. 10311 

[17] A. Laio and M. Parrinello, Proc. Nat. Acad. Sci. U.S.A. 99, 12562 (2002), 
www.pnas.org/cgi/doi/10.1073/pnas.202427399. Y. Wu, J. D. Schmitt, and R. Car, J. Chem. 
Phys. 121, 1193 (2004). 

[18] Chenggang Zhou, T. C. Schulthess, Stefan Torbrgge, D. P. Landau, Phys. Rev. Lett. 96, 



120201 (2006), |arXiv:cond-m at/0509335 . 
[19] N. Cabibbo and E. Marinari, Phys. Lett. B 119 (1982) 387. 
[20] K. Fabricius and O. Haan, Phys. Lett. B 143, 459 (1984). 
[21] M. Creutz, Phys. Rev. D 36, 515 (1987). P. de Forcrand and O. Jahn, Nucl. Phys. Proc. 



Suppl. 129, 709 (2004) [a rXiv:hep-lat/0309153] . J. Kiskis, R. Narayanan and H. Neu- 



berger, Phys. Lett. B 574, 65 (2003) |arXiv:hep-lat/0308033 1 . P. de Forcrand and O. Jahn, 



arXiv:hep-lat/0503041 



[22] B. J. Schulz, K. Binder, M. Miiller, and D. P. Landau, Phys. Rev. E. 67, 067102 (2003), 



arXiv:cond-mat/0302123 



40 



